Model Card

This page lists the models that cogkit supports and the input constraints that apply to each one.

Training must strictly follow the requirements listed in the tables below: resolution, number of frames, prompt token limit, and video length.

CogVideo

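| Model Name | CogVideoX1.5-5B | CogVideoX1.5-5B-I2V | CogVideoX-2B | CogVideoX-5B | CogVideoX-5B-I2V |
| --- | --- | --- | --- | --- | --- |
| Release Date | November 8, 2024 | November 8, 2024 | August 6, 2024 | August 27, 2024 | September 19, 2024 |
| Video Resolution (W * H) | 1360 * 768 | Min(W, H) = 768; 768 ≤ Max(W, H) ≤ 1360; Max(W, H) % 16 = 0 | 720 * 480 | 720 * 480 | 720 * 480 |
| Number of Frames | Should be 16N + 1 where N ≤ 10 (default 81) | Should be 16N + 1 where N ≤ 10 (default 81) | Should be 8N + 1 where N ≤ 6 (default 49) | Should be 8N + 1 where N ≤ 6 (default 49) | Should be 8N + 1 where N ≤ 6 (default 49) |
| Prompt Language | English | English | English | English | English |
| Prompt Token Limit | 224 Tokens | 224 Tokens | 226 Tokens | 226 Tokens | 226 Tokens |
| Video Length | 5 seconds or 10 seconds | 5 seconds or 10 seconds | 6 seconds | 6 seconds | 6 seconds |
| Frame Rate | 16 frames / second | 16 frames / second | 8 frames / second | 8 frames / second | 8 frames / second |
| Download Link (Diffusers) | 🤗 HuggingFace, 🤖 ModelScope, 🟣 WiseModel | 🤗 HuggingFace, 🤖 ModelScope, 🟣 WiseModel | 🤗 HuggingFace, 🤖 ModelScope, 🟣 WiseModel | 🤗 HuggingFace, 🤖 ModelScope, 🟣 WiseModel | 🤗 HuggingFace, 🤖 ModelScope, 🟣 WiseModel |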
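
The frame-count and resolution rules above can be checked before a training run starts. The following is a minimal sketch, not part of cogkit: the names `FrameRule`, `valid_frame_count`, `duration_seconds`, and `valid_i2v_resolution` are hypothetical. It assumes N ≥ 1 (the table only gives an upper bound) and that video length is (frames - 1) / frame rate, which reproduces the 5, 10, and 6 second figures in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameRule:
    """Frame-count rule of the form step * N + 1 (hypothetical helper type)."""
    step: int   # frames advance in multiples of this, plus one
    max_n: int  # largest N allowed by the table
    fps: int    # frame rate listed in the table


# Values copied from the table; the constant names are illustrative only.
COGVIDEOX_1_5 = FrameRule(step=16, max_n=10, fps=16)  # CogVideoX1.5-5B / -I2V
COGVIDEOX_1_0 = FrameRule(step=8, max_n=6, fps=8)     # CogVideoX-2B / -5B / -5B-I2V


def valid_frame_count(frames: int, rule: FrameRule) -> bool:
    """True if frames == step * N + 1 for an integer N with 1 <= N <= max_n.

    The lower bound N >= 1 is an assumption; the table states only N <= max_n.
    """
    n, remainder = divmod(frames - 1, rule.step)
    return remainder == 0 and 1 <= n <= rule.max_n


def duration_seconds(frames: int, rule: FrameRule) -> float:
    """Video length implied by a frame count, assuming (frames - 1) / fps."""
    return (frames - 1) / rule.fps


def valid_i2v_resolution(width: int, height: int) -> bool:
    """Check the CogVideoX1.5-5B-I2V resolution constraints from the table."""
    short_side, long_side = min(width, height), max(width, height)
    return short_side == 768 and 768 <= long_side <= 1360 and long_side % 16 == 0


if __name__ == "__main__":
    print(valid_frame_count(81, COGVIDEOX_1_5))   # default for the 1.5 models
    print(duration_seconds(81, COGVIDEOX_1_5))
    print(valid_i2v_resolution(1360, 768))
```

Under these assumptions the default frame counts line up with the listed video lengths: 81 frames at 16 frames / second gives 5 seconds, 161 frames gives 10 seconds, and 49 frames at 8 frames / second gives 6 seconds.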

CogView

| Model Name | CogView4-6B (Latest) |
| --- | --- |
| Release Date | March 4, 2025 |
| Resolution | 512 ≤ W, H ≤ 2048; H * W ≤ 2^21; Max(W, H) % 32 = 0 |
| Prompt Language | English, 简体中文 |
| Prompt Token Limit | 1024 Tokens (GLM-4-9B) |
| Download Link (Diffusers) | 🤗 HuggingFace, 🤖 ModelScope, 🟣 WiseModel |
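
The three CogView4-6B resolution constraints can be combined into one check. This is a minimal sketch with a hypothetical function name, `valid_cogview4_resolution`, and it is not cogkit's own validation. It reads "512 ≤ (W, H) ≤ 2048" as applying to width and height separately, and it applies the divisibility rule to the longer side only, exactly as the table states it.

```python
MAX_PIXELS = 2 ** 21  # upper bound on H * W from the table (2,097,152 pixels)


def valid_cogview4_resolution(width: int, height: int) -> bool:
    """Check a (width, height) pair against the CogView4-6B table constraints."""
    # Each side must lie in [512, 2048] (assumed to apply to both sides).
    if not (512 <= width <= 2048 and 512 <= height <= 2048):
        return False
    # Total pixel count must not exceed 2^21.
    if width * height > MAX_PIXELS:
        return False
    # The longer side must be a multiple of 32, as written in the table.
    return max(width, height) % 32 == 0


if __name__ == "__main__":
    for size in [(1024, 1024), (2048, 1024), (2048, 2048), (500, 1024)]:
        print(size, valid_cogview4_resolution(*size))
```

Note that the pixel budget is the binding constraint at the top of the range: 2048 * 2048 passes the per-side limits but exceeds 2^21, while 2048 * 1024 sits exactly on the limit.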